599 research outputs found

    Discovery of large genomic inversions using long range information.

    Background: Although many algorithms are now available that aim to characterize different classes of structural variation, discovery of balanced rearrangements such as inversions remains an open problem. This is mainly due to the fact that breakpoints of such events typically lie within segmental duplications or common repeats, which reduces the mappability of short reads. The algorithms developed within the 1000 Genomes Project to identify inversions are limited to relatively short inversions, and there are currently no available algorithms to discover large inversions using high throughput sequencing technologies. Results: Here we propose a novel algorithm, VALOR, to discover large inversions using new sequencing methods that provide long range information such as 10X Genomics linked-read sequencing, pooled clone sequencing, or other similar technologies that we commonly refer to as long range sequencing. We demonstrate the utility of VALOR using both pooled clone sequencing and 10X Genomics linked-read sequencing generated from the genome of an individual from the HapMap project (NA12878). We also provide a comprehensive comparison of VALOR against several state-of-the-art structural variation discovery algorithms that use whole genome shotgun sequencing data. Conclusions: In this paper, we show that VALOR is able to accurately discover all previously identified and experimentally validated large inversions in the same genome with a low false discovery rate. Using VALOR, we also predicted a novel inversion, which we validated using fluorescent in situ hybridization. VALOR is available at https://github.com/BilkentCompGen/VALOR

    Symmetry of Energy Divergence Anomalies Associated with the El Niño-Southern Oscillation

    The El Niño-Southern Oscillation (ENSO) is a dominant source of global climate variability. The effects of this phenomenon alter the flow of heat from tropical to polar latitudes, resulting in weather and climate anomalies that are difficult to forecast. The current work quantified two components of the vertically integrated equation for the total energy content of an atmospheric column, to show the anomalous horizontal redistribution of surface heat flux anomalies. Symmetric and asymmetric components of the vertically integrated latent and sensible heat flux divergence were quantified using ERA-Interim atmospheric reanalysis output on 30 model layers between 1979 and 2016. Results indicate that asymmetry is a fundamental component of ENSO-induced weather and climate anomalies at the global scale, challenging the common assumption that each phase of ENSO is equal and opposite. In particular, a substantial asymmetric component was identified in the relationship between ENSO and patterns of extratropical climate variability that may be proportional to differences in sea surface temperature anomalies during each phase of ENSO. This work advances our understanding of the global distributions of source and sink regions, which may improve future predictions of ENSO-induced precipitation and surface temperature anomalies. Future studies should apply these methods to advance understanding and to validate predictions of ENSO-induced weather and climate anomalies

    Evolutionary-new centromeres preferentially emerge within gene deserts

    A study identifying genomic restructuring and the absence of genes as conditions permissive for the seeding of new centromeres in primates

    The genomic distribution of intraspecific and interspecific sequence divergence of human segmental duplications relative to human/chimpanzee chromosomal rearrangements

    Background: It has been suggested that chromosomal rearrangements harbor the molecular footprint of the biological phenomena which they induce, in the form, for instance, of changes in the sequence divergence rates of linked genes. So far, all the studies of these potential associations have focused on the relationship between structural changes and the rates of evolution of single-copy DNA and have tried to exclude segmental duplications (SDs). This is paradoxical, since SDs are one of the primary forces driving the evolution of structure and function in our genomes and have been linked not only with novel genes acquiring new functions, but also with overall higher DNA sequence divergence and major chromosomal rearrangements. Results: Here we take the opposite view and focus on SDs. We analyze several of the features of SDs, including the rates of intraspecific divergence between paralogous copies of human SDs and of interspecific divergence between human SDs and chimpanzee DNA. We study how divergence measures relate to chromosomal rearrangements, while considering other factors that affect evolutionary rates in single-copy DNA. Conclusion: We find that interspecific SD divergence behaves similarly to divergence of single-copy DNA. In contrast, old and recent paralogous copies of SDs do present different patterns of intraspecific divergence. Also, we show that some relatively recent SDs accumulate in regions that carry inversions in sister lineages. This research was supported by a grant to A.N. from the Ministerio de Ciencia y Tecnologia (Spain, BFU2006 15413-C02-01) and by BE2005 and BP2006 fellowships to T.M.B from the "Departament d'Educacio i Universitats de la Generalitat de Catalunya"

    Organization and Evolution of Primate Centromeric DNA from Whole-Genome Shotgun Sequence Data

    The major DNA constituent of primate centromeres is alpha satellite DNA. As much as 2%–5% of sequence generated as part of primate genome sequencing projects consists of this material, which is fragmented or not assembled as part of published genome sequences due to its highly repetitive nature. Here, we develop computational methods to rapidly recover and categorize alpha-satellite sequences from previously uncharacterized whole-genome shotgun sequence data. We present an algorithm to computationally predict potential higher-order array structure based on paired-end sequence data and then validate its organization and distribution by experimental analyses. Using whole-genome shotgun data from the human, chimpanzee, and macaque genomes, we examine the phylogenetic relationship of these sequences and provide further support for a model for their evolution and mutation over the last 25 million years. Our results confirm fundamental differences in the dispersal and evolution of centromeric satellites in the Old World monkey and ape lineages

    Population Stratification of a Common APOBEC Gene Deletion Polymorphism

    The APOBEC3 gene family plays a role in innate cellular immunity inhibiting retroviral infection, hepatitis B virus propagation, and the retrotransposition of endogenous elements. We present a detailed sequence and population genetic analysis of a 29.5-kb common human deletion polymorphism that removes the APOBEC3B gene. We developed a PCR-based genotyping assay, characterized 1,277 human diversity samples, and found that the frequency of the deletion allele varies significantly among major continental groups (global F_ST = 0.2843). The deletion is rare in Africans and Europeans (frequencies of 0.9% and 6%, respectively), more common in East Asians and Amerindians (36.9% and 57.7%, respectively), and almost fixed in Oceanic populations (92.9%). Despite a worldwide frequency of 22.5%, analysis of data from the International HapMap Project reveals that no single existing tag single nucleotide polymorphism may serve as a surrogate for the deletion variant, emphasizing that without careful analysis its phenotypic impact may be overlooked in association studies. Application of haplotype-based tests for selection revealed potential pitfalls in the direct application of existing methods to the analysis of genomic structural variation. These data emphasize the importance of directly genotyping structural variation in association studies and of accurately resolving variant breakpoints before proceeding with more detailed population-genetic analysis